Biotechnology - Cloning and Genetic Engineering (GPC) 
By the end of this section, you will be able to: 


• Explain the basic techniques used to manipulate genetic material 
• Explain molecular and reproductive cloning 


Biotechnology is the use of artificial methods to modify the genetic 
material of living organisms or cells to produce novel compounds or to 
perform new functions. Biotechnology has been used for improving 
livestock and crops since the beginning of agriculture through selective 
breeding. Since the discovery of the structure of DNA in 1953, and 
particularly since the development of tools and methods to manipulate 
DNA in the 1970s, biotechnology has become synonymous with the 
manipulation of organisms’ DNA at the molecular level. The primary 
applications of this technology are in medicine (for the production of 
vaccines and antibiotics) and in agriculture (for the genetic modification of 
crops). Biotechnology also has many industrial applications, such as 
fermentation, the treatment of oil spills, and the production of biofuels, as 
well as many household applications such as the use of enzymes in laundry 
detergent. 


Manipulating Genetic Material 


To accomplish the applications described above, biotechnologists must be 
able to extract, manipulate, and analyze nucleic acids. 


Review of Nucleic Acid Structure 


To understand the basic techniques used to work with nucleic acids, 
remember that nucleic acids are macromolecules made of nucleotides (a 
sugar, a phosphate, and a nitrogenous base). The phosphate groups on these 
molecules each have a net negative charge. An entire set of DNA molecules 
in the nucleus of eukaryotic organisms is called the genome. DNA has two 
complementary strands linked by hydrogen bonds between the paired bases. 


Unlike DNA in eukaryotic cells, RNA molecules leave the nucleus. 
Messenger RNA (mRNA) is analyzed most frequently because it represents 
the protein-coding genes that are being expressed in the cell. 


Isolation of Nucleic Acids 


To study or manipulate nucleic acids, the DNA must first be extracted from 
cells. Various techniques are used to extract different types of DNA ([link]). 
Most nucleic acid extraction techniques involve steps to break open the cell, 
and then the use of enzymatic reactions to destroy all undesired 
macromolecules. Cells are broken open using a detergent solution 
containing buffering compounds. To prevent degradation and 
contamination, macromolecules such as proteins and RNA are inactivated 
using enzymes. The DNA is then brought out of solution using alcohol. The 
resulting DNA, because it is made up of long polymers, forms a gelatinous 
mass. 
This diagram shows the basic method used for the 
extraction of DNA. 


RNA is studied to understand gene expression patterns in cells. RNA is 
naturally very unstable because enzymes that break down RNA are 
commonly present in nature. Some are even secreted by our own skin and 
are very difficult to inactivate. Similar to DNA extraction, RNA extraction 
involves the use of various buffers and enzymes to inactivate other 
macromolecules and preserve only the RNA. 


Gel Electrophoresis 


Because nucleic acids are negatively charged ions at neutral or alkaline pH 
in an aqueous environment, they can be moved by an electric field. Gel 
electrophoresis is a technique used to separate charged molecules on the 
basis of size and charge. The nucleic acids can be separated as whole 
chromosomes or as fragments. The nucleic acids are loaded into a slot at 
one end of a gel matrix, an electric current is applied, and negatively 
charged molecules are pulled toward the opposite end of the gel (the end 
with the positive electrode). Smaller molecules move through the pores in 
the gel faster than larger molecules; this difference in the rate of migration 
separates the fragments on the basis of size. The nucleic acids in a gel 
matrix are invisible until they are stained with a compound that allows them 
to be seen, such as a dye. Distinct fragments of nucleic acids appear as 
bands at specific distances from the top of the gel (the negative electrode 
end) that are based on their size ([link]). A mixture of many fragments of 
varying sizes appears as a long smear, whereas uncut genomic DNA is 
usually too large to run through the gel and forms a single large band at the 
top of the gel. 
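

The size-based separation described above is what lets a band's position be turned into a size estimate. The following is a minimal sketch, not a laboratory protocol. It assumes that migration distance is roughly linear in the logarithm of fragment size over the range of a size ladder, a common approximation that this section does not state. The ladder values and the function name estimate_size_bp are hypothetical.

```python
import math

# Hypothetical size ladder: (fragment size in base pairs, distance migrated in mm).
# Smaller fragments travel farther from the loading slot.
LADDER = [(10000, 10.0), (5000, 18.0), (2000, 28.5), (1000, 36.5), (500, 44.5)]


def estimate_size_bp(distance_mm, ladder=LADDER):
    """Estimate fragment size by interpolating log10(size) against distance.

    Assumes distance is approximately linear in log10(size) between
    neighboring ladder bands (an approximation, not an exact law).
    """
    points = sorted(ladder, key=lambda p: p[1])  # order bands by distance
    for (size_a, dist_a), (size_b, dist_b) in zip(points, points[1:]):
        if dist_a <= distance_mm <= dist_b:
            fraction = (distance_mm - dist_a) / (dist_b - dist_a)
            log_a, log_b = math.log10(size_a), math.log10(size_b)
            return 10 ** (log_a + fraction * (log_b - log_a))
    raise ValueError("distance lies outside the range covered by the ladder")


if __name__ == "__main__":
    for d in (12.0, 25.0, 40.0):
        print(f"band at {d} mm is roughly {estimate_size_bp(d):.0f} bp")
```

A band that falls outside the ladder raises an error rather than extrapolating, because the approximation is least reliable for very large or very small fragments.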


Shown are DNA fragments from six samples 
run on a gel, stained with a fluorescent dye 
and viewed under UV light. Labels in the 
figure mark the larger fragments and the 
smaller fragments. (credit: 
modification of work by James Jacob, 
Tompkins Cortland Community College) 


Polymerase Chain Reaction 


DNA analysis often requires focusing on one or more specific regions of 
the genome. It also frequently involves situations in which only one or a 
few copies of a DNA molecule are available for further analysis. These 
amounts are insufficient for most procedures, such as gel electrophoresis. 
Polymerase chain reaction (PCR) is a technique used to rapidly increase 
the number of copies of specific regions of DNA for further analyses 
([link]). PCR uses a special form of DNA polymerase, the enzyme that 
replicates DNA, and other short nucleotide sequences called primers that 
base pair to a specific portion of the DNA being replicated. PCR is used for 
many purposes in laboratories. These include: 1) the identification of the 
owner of a DNA sample left at a crime scene; 2) paternity analysis; 3) the 
comparison of small amounts of ancient DNA with modern organisms; and 
4) determining the sequence of nucleotides in a specific region. 
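

To give a sense of how quickly PCR can increase the number of copies, here is a small sketch of idealized amplification. It assumes that each cycle at most doubles the number of target molecules, with an optional per-cycle efficiency below 1.0 for non-ideal reactions. The function name pcr_copies and the cycle counts are illustrative assumptions, not values from this section.

```python
def pcr_copies(initial_copies, cycles, efficiency=1.0):
    """Return the idealized number of copies after a number of PCR cycles.

    efficiency=1.0 models perfect doubling in every cycle; lower values
    model reactions in which only a fraction of templates are copied.
    """
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    if cycles < 0:
        raise ValueError("cycles must not be negative")
    return initial_copies * (1.0 + efficiency) ** cycles


if __name__ == "__main__":
    for n in (0, 10, 20, 30):
        print(f"{n} cycles: about {pcr_copies(1, n):.0f} copies")
```

Under these assumptions, a single starting molecule yields on the order of a billion copies after 30 cycles, which is why one or a few molecules can be enough to begin with.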
Polymerase chain reaction, or 
PCR, is used to produce many 
copies of a specific sequence of 
DNA using a special form of DNA 
polymerase. 


Cloning 


In general, cloning means the creation of a perfect replica. Typically, the 
word is used to describe the creation of a genetically identical copy. In 
biology, the re-creation of a whole organism is referred to as “reproductive 
cloning.” Long before attempts were made to clone an entire organism, 
researchers learned how to copy short stretches of DNA—a process that is 
referred to as molecular cloning. 


Molecular Cloning 


Cloning allows for the creation of multiple copies of genes, expression of 
genes, and study of specific genes. To get the DNA fragment into a 
bacterial cell in a form that will be copied or expressed, the fragment is first 
inserted into a plasmid. A plasmid (also called a vector in this context) is a 
small circular DNA molecule that replicates independently of the 
chromosomal DNA in bacteria. In cloning, the plasmid molecules can be 
used to provide a "vehicle" in which to insert a desired DNA fragment. 
Modified plasmids are usually reintroduced into a bacterial host for 
replication. As the bacteria divide, they copy their own DNA (including the 
plasmids). The inserted DNA fragment is copied along with the rest of the 
bacterial DNA. In a bacterial cell, the fragment of DNA from the human 
genome (or another organism that is being studied) is referred to as foreign 
DNA to differentiate it from the DNA of the bacterium (the host DNA). 


Plasmids occur naturally in bacterial populations (such as Escherichia coli) 
and have genes that can contribute favorable traits to the organism, such as 
antibiotic resistance (the ability to be unaffected by antibiotics). Plasmids 
have been highly engineered as vectors for molecular cloning and for the 
subsequent large-scale production of important molecules, such as insulin. 


A valuable characteristic of plasmid vectors is the ease with which a foreign 
DNA fragment can be introduced. These plasmid vectors contain many 
short DNA sequences that can be cut with different commonly available 
restriction enzymes. Restriction enzymes (also called restriction 
endonucleases) recognize specific DNA sequences and cut them in a 
predictable manner; they are naturally produced by bacteria as a defense 
mechanism against foreign DNA. Many restriction enzymes make 
staggered cuts in the two strands of DNA, such that the cut ends have a 2- 
to 4-nucleotide single-stranded overhang. The sequence that is recognized 
by the restriction enzyme is a four- to eight-nucleotide sequence that is a 
palindrome. As with a word palindrome, this means the sequence reads 
the same forward and backward. In most cases, the sequence reads the same 
forward on one strand and backward on the complementary strand. When a 
staggered cut is made in a sequence like this, the overhangs are 
complementary ([link]). 
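

The palindrome property can be checked by comparing a sequence with its reverse complement, that is, the complementary strand read in its own 5' to 3' direction. The sketch below uses GAATTC, the recognition site of the restriction enzyme EcoRI, as an example. That site is not named in this section, and the function names are illustrative.

```python
# Watson-Crick base pairing for DNA.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the complementary strand, written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def is_palindromic_site(seq):
    """True if the sequence reads the same 5' to 3' on both strands."""
    return seq.upper() == reverse_complement(seq)


if __name__ == "__main__":
    for site in ("GAATTC", "GATTAC"):
        print(site, reverse_complement(site), is_palindromic_site(site))
```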


In this (a) six-nucleotide restriction enzyme recognition site, 
notice that the sequence of six nucleotides reads the same in 
the 5' to 3' direction on one strand as it does in the 5' to 3' 
direction on the complementary strand. This is known as a 
palindrome. (b) The restriction enzyme makes breaks in the 
DNA strands, and (c) the cut in the DNA results in “sticky 
ends”. Another piece of DNA cut on either end by the same 
restriction enzyme could attach to these sticky ends and be 
inserted into the gap made by this cut. 


Because these overhangs are capable of coming back together by hydrogen 
bonding with complementary overhangs on a piece of DNA cut with the 
same restriction enzyme, these are called “sticky ends.” The process of 
forming hydrogen bonds between complementary sequences on single 
strands to form double-stranded DNA is called annealing. Addition of an 
enzyme called DNA ligase, which takes part in DNA replication in cells, 
permanently joins the DNA fragments when the sticky ends come together. 
In this way, any DNA fragment can be spliced between the two ends of a 
plasmid DNA that has been cut with the same restriction enzyme ([link]). 
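

The cut-and-splice idea can be sketched with plain strings. This is a deliberately simplified model under stated assumptions. It tracks only the top strand, it represents the circular plasmid as a linear string opened at an arbitrary point, and it uses the EcoRI site GAATTC, cut after the first G, as the example enzyme. Both sequences are hypothetical and far shorter than real DNA.

```python
def digest(seq, site="GAATTC", cut_offset=1):
    """Cut the top strand `cut_offset` bases into every occurrence of `site`.

    Returns the resulting fragments in order. The staggered cut on the
    bottom strand is implied rather than modeled.
    """
    fragments, start = [], 0
    idx = seq.find(site)
    while idx != -1:
        cut = idx + cut_offset
        fragments.append(seq[start:cut])
        start = cut
        idx = seq.find(site, idx + 1)
    fragments.append(seq[start:])
    return fragments


plasmid = "AAAGAATTCTTT"          # hypothetical plasmid region with one site
foreign = "CCGAATTCGGGGGAATTCAA"  # hypothetical foreign DNA with two sites

left, right = digest(plasmid)     # plasmid opened at its single site
_, insert, _ = digest(foreign)    # middle fragment has a sticky end on each side

# Annealing plus ligation, modeled as joining the compatible ends.
recombinant = left + insert + right

if __name__ == "__main__":
    print(recombinant)
```

In this model the recognition site is regenerated at both junctions, so the same enzyme could later cut the insert back out.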


Figure labels: foreign DNA, plasmid, restriction site, lacZ gene, 
ampicillin resistance gene. 


Both foreign DNA and a plasmid are cut with the same restriction enzyme. 
The restriction site occurs only once in the plasmid in the middle of a gene 
for an enzyme (lacZ). 


The restriction enzyme leaves complementary sticky ends on the foreign 
DNA fragment and the plasmid. This allows the foreign DNA to be inserted 
into the plasmid when the sticky ends anneal. Adding DNA ligase reattaches 
the DNA backbones. These are recombinant plasmids. 


The plasmids are combined with a culture of living bacteria. The bacterial 
genome is missing the lacZ gene. Bacteria may take up plasmid with or 
without the insert, or may not take up plasmid at all. Many of the bacteria 
do not take any plasmids into their cells, many take plasmids that do not 
have the foreign DNA in them, and a few take up the recombinant plasmid. 


The bacteria that take up the recombinant plasmid cannot make the enzyme 
from the gene that the fragment was inserted into (lacZ). They also carry a 
gene for resistance to the antibiotic ampicillin, which was on the original 
plasmid. 


To find the bacteria with the recombinant plasmid, the bacteria are grown 
on a plate with the antibiotic ampicillin and a substance that changes color 
when exposed to the enzyme produced by the lacZ gene. The ampicillin will 
kill any bacteria that did not take up a plasmid. The color of the substance 
will not change when the gene for lacZ contains the foreign DNA insert. 
These are the bacteria with the recombinant plasmid that we want to grow. 
White colonies have plasmids with the foreign insert. Blue colonies have 
plasmids without insert. 


This diagram shows the steps involved in molecular cloning. 
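

The screening step in the diagram amounts to a small decision rule, sketched below. The function name colony_outcome and its two boolean inputs are illustrative. The rule assumes, as in the diagram, that the plasmid carries the ampicillin resistance gene, that the insert interrupts lacZ, and that the bacterial genome has no lacZ gene of its own.

```python
def colony_outcome(has_plasmid, has_insert):
    """Predict what a bacterium does on an ampicillin plate with color indicator.

    has_insert is only meaningful when has_plasmid is True.
    """
    if not has_plasmid:
        # No ampicillin resistance gene, so the antibiotic kills the cell.
        return "no growth"
    if has_insert:
        # The insert interrupts lacZ, so no enzyme and no color change.
        return "white"
    # Intact lacZ makes the enzyme that changes the indicator's color.
    return "blue"


if __name__ == "__main__":
    for plasmid, insert in ((False, False), (True, False), (True, True)):
        print(plasmid, insert, colony_outcome(plasmid, insert))
```

Only the white colonies carry the recombinant plasmid, so those are the ones selected for further growth.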


Plasmids with foreign DNA inserted into them are called recombinant 
DNA molecules because they contain new combinations of genetic 
material. Proteins that are produced from recombinant DNA molecules are 
called recombinant proteins. Not all recombinant plasmids are capable of 
expressing genes. Plasmids may also be engineered to express proteins only 
when stimulated by certain environmental factors, so that scientists can 
control the expression of the recombinant proteins. 


Reproductive Cloning 


Reproductive cloning is a method used to make a clone or an identical 
copy of an entire multicellular organism. Most multicellular organisms 
undergo reproduction by sexual means, which involves the contribution of 
DNA from two individuals (parents), making it impossible to generate an 
identical copy or a clone of either parent. Recent advances in biotechnology 
have made it possible to reproductively clone mammals in the laboratory. 


Natural sexual reproduction involves the union, during fertilization, of a 
sperm and an egg. Each of these gametes is haploid, meaning they contain 
one set of chromosomes in their nuclei. The resulting cell, or zygote, is then 
diploid and contains two sets of chromosomes. This cell divides mitotically 
to produce a multicellular organism. However, the union of just any two 
cells cannot produce a viable zygote; there are components in the cytoplasm 
of the egg cell that are essential for the early development of the embryo 
during its first few cell divisions. Without these provisions, there would be 
no subsequent development. Therefore, to produce a new individual, both a 
diploid genetic complement and an egg cytoplasm are required. The 
approach to producing an artificially cloned individual is to take the egg 
cell of one individual and to remove the haploid nucleus. Then a diploid 
nucleus from a body cell of a second individual, the donor, is put into the 
egg cell. The egg is then stimulated to divide so that development proceeds. 
This sounds simple, but in fact it takes many attempts before each of the 
steps is completed successfully. 


The first cloned agricultural animal was Dolly, a sheep who was born in 
1996. The success rate of reproductive cloning at the time was very low. 
Dolly lived for six years and died of a lung tumor ([link]). There was 
speculation that because the cell DNA that gave rise to Dolly came from an 
older individual, the age of the DNA may have affected her life expectancy. 
Since Dolly, several species of animals (such as horses, bulls, and goats) 
have been successfully cloned. 


There have been attempts at producing cloned human embryos as sources of 
embryonic stem cells. In the procedure, the DNA from an adult human is 
introduced into a human egg cell, which is then stimulated to divide. The 
technology is similar to the technology that was used to produce Dolly, but 
the embryo is never implanted into a surrogate mother. The cells produced 
are called embryonic stem cells because they have the capacity to develop 
into many different kinds of cells, such as muscle or nerve cells. The stem 
cells could be used to research and ultimately provide therapeutic 
applications, such as replacing damaged tissues. The benefit of cloning in 
this instance is that the cells used to regenerate new tissues would be a 
perfect match to the donor of the original DNA. For example, a leukemia 
patient would not require a sibling with a tissue match for a bone-marrow 
transplant. 


Note: 
Art Connection 
Dolly the sheep was the first 
agricultural animal to be 
cloned. To create Dolly, the 
nucleus was removed from a 
donor egg cell. The 
enucleated egg was placed 
next to the other cell, then 
they were shocked to fuse. 
They were shocked again to 
start division. The cells 
were allowed to divide for 
several days until an early 
embryonic stage was 
reached, before being 
implanted in a surrogate 
mother. 


Why was Dolly a Finn-Dorset and not a Scottish Blackface sheep? 


Genetic Engineering 


Using recombinant DNA technology to modify an organism’s DNA to 
achieve desirable traits is called genetic engineering. Addition of foreign 
DNA in the form of recombinant DNA vectors that are generated by 
molecular cloning is the most common method of genetic engineering. An 
organism that receives the recombinant DNA is called a genetically 
modified organism (GMO). If the foreign DNA that is introduced comes 
from a different species, the host organism is called transgenic. Bacteria, 
plants, and animals have been genetically modified since the early 1970s 
for academic, medical, agricultural, and industrial purposes. These 
applications will be examined in more detail in the next module. 


Note: 
Concept in Action 
Watch this short video explaining how scientists create a transgenic animal. 


Although the classic methods of studying the function of genes began with 
a given phenotype and determined the genetic basis of that phenotype, 
modern techniques allow researchers to start at the DNA sequence level and 
ask: "What does this gene or DNA element do?" This technique, called 
reverse genetics, has resulted in reversing the classical genetic 
methodology. One example of this method is analogous to damaging a body 
part to determine its function. An insect that loses a wing cannot fly, which 
means that the wing’s function is flight. The classic genetic method 
compares insects that cannot fly with insects that can fly, and observes that 
the non-flying insects have lost wings. Similarly, in a reverse genetics 
approach, mutating or deleting genes provides researchers with clues about 
gene function. Alternatively, reverse genetics can be used to cause a gene to 
be overexpressed to determine what phenotypic effects may occur. 


Section Summary 


Nucleic acids can be isolated from cells for the purposes of further analysis 
by breaking open the cells and enzymatically destroying all other major 
macromolecules. Fragmented or whole chromosomes can be separated on 
the basis of size by gel electrophoresis. Short stretches of DNA can be 
amplified by PCR. DNA can be cut (and subsequently re-spliced together) 
using restriction enzymes. The molecular and cellular techniques of 
biotechnology allow researchers to genetically engineer organisms, 
modifying them to achieve desirable traits. 


Cloning may involve cloning small DNA fragments (molecular cloning), or 
cloning entire organisms (reproductive cloning). In molecular cloning with 
bacteria, a desired DNA fragment is inserted into a bacterial plasmid using 
restriction enzymes and the plasmid is taken up by a bacterium, which will 
then express the foreign DNA. Using other techniques, foreign genes can be 
inserted into eukaryotic organisms. In each case, the organisms are called 
transgenic organisms. In reproductive cloning, a donor nucleus is put into 
an enucleated egg cell, which is then stimulated to divide and develop into 
an organism. 


In reverse genetics methods, a gene is mutated or removed in some way to 
identify its effect on the phenotype of the whole organism as a way to 
determine its function. 


Art Connections 


Exercise: 


Problem: 


[link] Why was Dolly a Finn-Dorset and not a Scottish Blackface 
sheep? 


Solution: 


[link] Because even though the original cell came from a Scottish 
Blackface sheep and the surrogate mother was a Scottish Blackface, 
the DNA came from a Finn-Dorset. 


Glossary 


anneal 
in molecular biology, the process by which two single strands of DNA 
hydrogen bond at complementary nucleotides to form a double- 
stranded molecule 


biotechnology 
the use of artificial methods to modify the genetic material of living 
organisms or cells to produce novel compounds or to perform new 
functions 


cloning 
the production of an exact copy—specifically, an exact genetic copy— 
of a gene, cell, or organism 


gel electrophoresis 
a technique used to separate molecules on the basis of their ability to 
migrate through a semisolid gel in response to an electric current 


genetic engineering 
alteration of the genetic makeup of an organism using the molecular 
methods of biotechnology 


genetically modified organism (GMO) 
an organism whose genome has been artificially changed 


plasmid 
a small circular molecule of DNA found in bacteria that replicates 
independently of the main bacterial chromosome; plasmids code for 
some important traits for bacteria and can be used as vectors to 
transport DNA into bacteria in genetic engineering applications 


polymerase chain reaction (PCR) 
a technique used to make multiple copies of DNA 


recombinant DNA 
a combination of DNA fragments generated by molecular cloning that 
does not exist in nature 


recombinant protein 
a protein that is expressed from recombinant DNA molecules 


restriction enzyme 
an enzyme that recognizes a specific nucleotide sequence in DNA and 
cuts the DNA double strand at that recognition site, often with a 
staggered cut leaving short single strands or “sticky” ends 


reverse genetics 
a form of genetic analysis that manipulates DNA to disrupt or affect 
the product of a gene to analyze the gene’s function 


reproductive cloning 
cloning of entire organisms 


transgenic 
describing an organism that receives DNA from a different species 


Biotechnology in Medicine and Agriculture (GPC) 
By the end of this section, you will be able to: 


• Describe uses of biotechnology in medicine 
• Describe uses of biotechnology in agriculture 


It is easy to see how biotechnology can be used for medicinal purposes. 
Knowledge of the genetic makeup of our species, the genetic basis of 
heritable diseases, and the invention of technology to manipulate and fix 
mutant genes provides methods to treat diseases. Biotechnology in 
agriculture can enhance resistance to disease, pests, and environmental 
stress to improve both crop yield and quality. 


Genetic Diagnosis and Gene Therapy 


The process of testing for suspected genetic defects before administering 
treatment is called genetic diagnosis by genetic testing. In some cases in 
which a genetic disease is present in an individual’s family, family members 
may be advised to undergo genetic testing. For example, mutations in the 
BRCA genes may increase the likelihood of developing breast and ovarian 
cancers in women and some other cancers in women and men. A woman 
with breast cancer can be screened for these mutations. If one of the high- 
risk mutations is found, her female relatives may also wish to be screened 
for that particular mutation, or simply be more vigilant for the occurrence of 
cancers. Genetic testing is also offered for fetuses (or embryos with in vitro 
fertilization) to determine the presence or absence of disease-causing genes 
in families with specific debilitating diseases. 


Note: 
Concept in Action 
See how human DNA is extracted for uses such as genetic testing. 


Gene therapy is a genetic engineering technique that may one day be used 
to cure certain genetic diseases. In its simplest form, it involves the 
introduction of a non-mutated gene at a random location in the genome to 
cure a disease by replacing a protein that may be absent in these individuals 
because of a genetic mutation. The non-mutated gene is usually introduced 
into diseased cells as part of a vector transmitted by a virus, such as an 
adenovirus, that can infect the host cell and deliver the foreign DNA into 
the genome of the targeted cell ([link]). To date, gene therapies have been 
primarily experimental procedures in humans. A few of these experimental 
treatments have been successful, and the methods may become more 
important in the future as the factors limiting their success are resolved. 
This diagram shows the steps involved in curing 
disease with gene therapy using an adenovirus 
vector. (credit: modification of work by NIH) 


Production of Vaccines, Antibiotics, and Hormones 


Traditional vaccination strategies use weakened or inactive forms of 
microorganisms or viruses to stimulate the immune system. Modern 
techniques use specific genes of microorganisms cloned into vectors and 
mass-produced in bacteria to make large quantities of specific substances to 
stimulate the immune system. The substance is then used as a vaccine. In 
some cases, such as the H1N1 flu vaccine, genes cloned from the virus have 
been used to combat the constantly changing strains of this virus. 


Antibiotics kill bacteria and are naturally produced by microorganisms such 
as fungi; penicillin is perhaps the most well-known example. Antibiotics are 
produced on a large scale by cultivating and manipulating fungal cells. The 
fungal cells have typically been genetically modified to improve the yields 
of the antibiotic compound. 


Recombinant DNA technology was used to produce large-scale quantities 
of the human hormone insulin in E. coli as early as 1978. Previously, diabetes 
was treated with insulin from animals such as pigs, which caused allergic 
reactions in many humans because of differences in the insulin molecule. In 
addition, human growth hormone (HGH) is used to treat growth disorders 
in children. The HGH gene was cloned from a cDNA (complementary 
DNA) library and inserted into E. coli cells by cloning it into a bacterial 
vector. 


Transgenic Animals 


Although several recombinant proteins used in medicine are successfully 
produced in bacteria, some proteins need a eukaryotic animal host for 
proper processing. For this reason, genes have been cloned and expressed in 
animals such as sheep, goats, chickens, and mice. Animals that have been 
modified to express recombinant DNA are called transgenic animals 
([link]). 


It can be seen that two of these 
mice are transgenic because 
they have a gene that causes 
them to fluoresce under a UV 
light. The non-transgenic mouse 
does not have the gene that 
causes fluorescence. (credit: 
Ingrid Moen et al.) 


Several human proteins are expressed in the milk of transgenic sheep and 
goats. In one commercial example, the FDA has approved a blood 
anticoagulant protein that is produced in the milk of transgenic goats for use 
in humans. Mice have been used extensively for expressing and studying 
the effects of recombinant genes and mutations. 


Transgenic Plants 


Manipulating the DNA of plants (creating genetically modified organisms, 
or GMOs) has helped to create desirable traits such as disease resistance, 
herbicide and pest resistance, better nutritional value, and better shelf life 
([link]). Plants are the most important source of food for the human 
population. Farmers developed ways to select for plant varieties with 
desirable traits long before modern-day biotechnology practices were 
established. 


Corn, a major agricultural 
crop used to create 
products for a variety of 
industries, is often 
modified through plant 
biotechnology. (credit: 
Keith Weller, USDA) 


Transgenic plants have received DNA from other species. Because they 
contain unique combinations of genes and are not restricted to the 
laboratory, transgenic plants and other GMOs are closely monitored by 
government agencies to ensure that they are fit for human consumption and 
do not endanger other plant and animal life. Because foreign genes can 
spread to other species in the environment, particularly in the pollen and 
seeds of plants, extensive testing is required to ensure ecological stability. 
Staples like corn, potatoes, and tomatoes were the first crop plants to be 
genetically engineered. 


Transformation of Plants Using Agrobacterium tumefaciens 


In plants, tumors caused by the bacterium Agrobacterium tumefaciens occur 
by transfer of DNA from the bacterium to the plant. The artificial 
introduction of DNA into plant cells is more challenging than in animal 
cells because of the thick plant cell wall. Researchers used the natural 
transfer of DNA from Agrobacterium to a plant host to introduce DNA 
fragments of their choice into plant hosts. In nature, the disease-causing A. 
tumefaciens have a set of plasmids that contain genes that integrate into the 
infected plant cell’s genome. Researchers manipulate the plasmids to carry 
the desired DNA fragment and insert it into the plant genome. 


The Organic Insecticide Bacillus thuringiensis 


Bacillus thuringiensis (Bt) is a bacterium that produces protein crystals that 
are toxic to many insect species that feed on plants. Insects that have eaten 
Bt toxin stop feeding on the plants within a few hours. After the toxin is 
activated in the intestines of the insects, death occurs within a couple of 
days. The crystal toxin genes have been cloned from the bacterium and 
introduced into plants, therefore allowing plants to produce their own 
crystal Bt toxin that acts against insects. Bt toxin is safe for the environment 
and non-toxic to mammals (including humans). As a result, it has been 
approved for use by organic farmers as a natural insecticide. There is some 
concern, however, that insects may evolve resistance to the Bt toxin in the 
same way that bacteria evolve resistance to antibiotics. 


FlavrSavr Tomato 


The first GM crop to be introduced into the market was the FlavrSavr 
Tomato produced in 1994. Molecular genetic technology was used to slow 
down the process of softening and rotting caused by fungal infections, 
which led to increased shelf life of the GM tomatoes. Additional genetic 
modification improved the flavor of this tomato. The FlavrSavr tomato did 
not stay on the market because of problems maintaining and 
shipping the crop. 


Section Summary 


Genetic testing is performed to identify disease-causing genes, and can be 
used to benefit affected individuals and their relatives who have not 
developed disease symptoms yet. Gene therapy—by which functioning 
genes are incorporated into the genomes of individuals with a non- 
functioning mutant gene—has the potential to cure heritable diseases. 
Transgenic organisms possess DNA from a different species, usually 
generated by molecular cloning techniques. Vaccines, antibiotics, and 
hormones are examples of products obtained by recombinant DNA 
technology. Transgenic animals have been created for experimental 
purposes and some are used to produce some human proteins. 


Genes are inserted into plants, using plasmids in the bacterium 
Agrobacterium tumefaciens, which infects plants. Transgenic plants have 
been created to improve the characteristics of crop plants—for example, by 
giving them insect resistance by inserting a gene for a bacterial toxin. 


Glossary 


gene therapy 
the technique used to cure heritable diseases by replacing mutant genes 
with good genes 


genetic testing 
identifying gene variants in an individual that may lead to a genetic 
disease in that individual 


Biotechnology - Genomics and Proteomics (GPC) 
By the end of this section, you will be able to: 


• Define genomics and proteomics 
• Define whole genome sequencing 
• Explain different applications of genomics and proteomics 


The study of nucleic acids began with the discovery of DNA, progressed to 
the study of genes and small fragments, and has now exploded to the field 
of genomics. Genomics is the study of entire genomes, including the 
complete set of genes, their nucleotide sequence and organization, and their 
interactions within a species and with other species. The advances in 
genomics have been made possible by DNA sequencing technology. Just as 
information technology has led to Google Maps that enable us to get 
detailed information about locations around the globe, genomic information 
is used to create similar maps of the DNA of different organisms. 


Mapping Genomes 


Genome mapping is the process of finding the location of genes on each 
chromosome. The maps that are created are comparable to the maps that we 
use to navigate streets. A genetic map is an illustration that lists genes and 
their location on a chromosome. Genetic maps provide the big picture 
(similar to a map of interstate highways) and use genetic markers (similar to 
landmarks). A genetic marker is a gene or sequence on a chromosome that 
shows genetic linkage with a trait of interest. The genetic marker tends to 
be inherited with the gene of interest, and one measure of distance between 
them is the recombination frequency during meiosis. Early geneticists 
called this linkage analysis. 
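
The link between recombination frequency and map distance can be sketched in a few lines of Python. This is a minimal illustration, not a full linkage analysis: the offspring counts are made up, the function names are hypothetical, and the conversion assumes the usual convention that 1 percent recombination corresponds to roughly 1 map unit (centimorgan), which holds only for markers that are fairly close together.

```python
def recombination_frequency(recombinant_offspring, total_offspring):
    """Fraction of offspring in which the marker and the gene were separated."""
    if total_offspring <= 0:
        raise ValueError("total_offspring must be positive")
    if not 0 <= recombinant_offspring <= total_offspring:
        raise ValueError("recombinant_offspring must be between 0 and total_offspring")
    return recombinant_offspring / total_offspring


def map_distance_cm(recombinant_offspring, total_offspring):
    """Approximate genetic distance in centimorgans (1 cM ~ 1 percent recombination)."""
    return 100.0 * recombination_frequency(recombinant_offspring, total_offspring)


# Hypothetical cross: 1,000 offspring scored, 120 of them recombinant.
rf = recombination_frequency(120, 1000)
distance = map_distance_cm(120, 1000)
print(f"recombination frequency = {rf:.2f}")
print(f"approximate map distance = {distance:.1f} cM")
```

A marker that is rarely separated from the gene of interest gives a small distance, which is why such markers work well as landmarks on a genetic map.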


Physical maps get into the intimate details of smaller regions of the 
chromosomes (similar to a detailed road map) ([link]). A physical map is a
representation of the physical distance, in nucleotides, between genes or 
genetic markers. Both genetic linkage maps and physical maps are required 
to build a complete picture of the genome. Having a complete map of the 
genome makes it easier for researchers to study individual genes. Human 
genome maps help researchers in their efforts to identify human
disease-causing genes related to illnesses such as cancer, heart disease, and cystic
fibrosis, to name a few. In addition, genome mapping can be used to help 
identify organisms with beneficial traits, such as microbes with the ability 
to clean up pollutants or even prevent pollution. Research involving plant 
genome mapping may lead to methods that produce higher crop yields or to 
the development of plants that adapt better to climate change.

chromosome. (credit: modification of work by NCBI, NIH)


Genetic maps provide the outline, and physical maps provide the details. It 
is easy to understand why both types of genome-mapping techniques are 
important to show the big picture. Information obtained from each 
technique is used in combination to study the genome. Genomic mapping is 
used with different model organisms that are used for research. Genome 
mapping is still an ongoing process, and as more advanced techniques are 
developed, more advances are expected. Genome mapping is similar to 
completing a complicated puzzle using every piece of available data. 
Mapping information generated in laboratories all over the world is entered 
into central databases, such as the National Center for Biotechnology 
Information (NCBI). Efforts are made to make the information more easily 
accessible to researchers and the general public. Just as we use global 
positioning systems instead of paper maps to navigate through roadways, 
NCBI allows us to use a genome viewer tool to simplify the data mining 
process. 


Note: 
Concept in Action 


Online Mendelian Inheritance in Man (OMIM) is a searchable online
catalog of human genes and genetic disorders. This website shows genome
mapping, and also details the history and research of each trait and
disorder. Click the link to search for traits (such as handedness) and genetic
disorders (such as diabetes).


Whole Genome Sequencing 


Although there have been significant advances in the medical sciences in
recent years, doctors are still confounded by many diseases, and researchers
are using whole genome sequencing to get to the bottom of the problem.
Whole genome sequencing is a process that determines the DNA sequence 
of an entire genome. Whole genome sequencing is a brute-force approach 
to problem solving when there is a genetic basis at the core of a disease. 
Several laboratories now provide services to sequence, analyze, and 
interpret entire genomes. 


In 2010, whole genome sequencing was used to save a young boy whose 
intestines had multiple mysterious abscesses. The child had several colon 
operations with no relief. Finally, a whole genome sequence revealed a 
defect in a pathway that controls apoptosis (programmed cell death). A 
bone marrow transplant was used to overcome this genetic disorder, leading 
to a cure for the boy. He was the first person to be successfully diagnosed 
using whole genome sequencing. 


The first genomes to be sequenced, such as those belonging to viruses, 
bacteria, and yeast, were smaller in terms of the number of nucleotides than 
the genomes of multicellular organisms. The genomes of other model 
organisms, such as the mouse (Mus musculus), the fruit fly (Drosophila 
melanogaster), and the nematode (Caenorhabditis elegans) are now known. 
A great deal of basic research is performed in model organisms because 
the information can be applied to other organisms. A model organism is a 
species that is studied as a model to understand the biological processes in 
other species that can be represented by the model organism. For example, 
fruit flies are able to metabolize alcohol like humans, so the genes affecting 
sensitivity to alcohol have been studied in fruit flies in an effort to 
understand the variation in sensitivity to alcohol in humans. Having entire 
genomes sequenced helps with the research efforts in these model 
organisms ([link]).

Much basic research is done with model
organisms, such as the mouse, Mus musculus; the
fruit fly, Drosophila melanogaster; the nematode
Caenorhabditis elegans; the yeast Saccharomyces
cerevisiae; and the common weed, Arabidopsis
thaliana. (credit "mouse": modification of work by
Florean Fortescue; credit "nematodes":
modification of work by "snickclunk"/Flickr;
credit "common weed": modification of work by
Peggy Greb, USDA; scale-bar data from Matt
Russell)


The first human genome sequence was published in 2003. The number of 
whole genomes that have been sequenced steadily increases and now 
includes hundreds of species and thousands of individual human genomes. 


Applying Genomics 


The introduction of DNA sequencing and whole genome sequencing 
projects, particularly the Human Genome Project, has expanded the 
applicability of DNA sequence information. Genomics is now being used in
a wide variety of fields, such as metagenomics, pharmacogenomics, and
mitochondrial genomics. The most commonly known application of 
genomics is to understand and find cures for diseases. 


Predicting Disease Risk at the Individual Level 


Predicting the risk of disease involves screening and identifying currently 
healthy individuals by genome analysis at the individual level. Intervention 
with lifestyle changes and drugs can be recommended before disease onset. 
However, this approach is most applicable when the problem arises from a 
single gene mutation. Such defects account for only about 5 percent of
diseases found in developed countries. Most of the common diseases, such
as heart disease, are multifactorial or polygenic, meaning that the
phenotype is determined by two or more genes and also by
environmental factors such as diet. In April 2010, scientists at Stanford
University published the genome analysis of a healthy individual (Stephen 
Quake, a scientist at Stanford University, who had his genome sequenced); 
the analysis predicted his propensity to acquire various diseases. A risk 
assessment was done to analyze Quake’s percentage of risk for 55 different 
medical conditions. A rare genetic mutation was found that showed him to 
be at risk for sudden heart attack. He was also predicted to have a 23 
percent risk of developing prostate cancer and a 1.4 percent risk of 
developing Alzheimer’s disease. The scientists used databases and several 
publications to analyze the genomic data. Even though genomic sequencing 
is becoming more affordable and analytical tools are becoming more 
reliable, ethical issues surrounding genomic analysis at a population level 
remain to be addressed. For example, could such data be legitimately used 
to charge more or less for insurance or to affect credit ratings? 


Genome-wide Association Studies 


Since 2005, it has been possible to conduct a type of study called a
genome-wide association study, or GWAS. A GWAS is a method that identifies
differences between individuals in single nucleotide polymorphisms (SNPs)
that may be involved in causing diseases. The method is particularly suited
to diseases that may be affected by one or many genetic changes throughout 
the genome. It is very difficult to identify the genes involved in such a 
disease using family history information. The GWAS method relies on a 
genetic database that has been in development since 2002 called the 
International HapMap Project. The HapMap Project sequenced the genomes 
of several hundred individuals from around the world and identified groups 
of SNPs. The groups include SNPs that are located near each other on
chromosomes, so they tend to stay together through recombination. The fact
that the group stays together means that identifying one marker SNP is all 
that is needed to identify all the SNPs in the group. There are several 
million SNPs identified, but identifying them in other individuals who have 
not had their complete genome sequenced is much easier because only the 
marker SNPs need to be identified. 
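
The idea that one marker SNP stands in for its whole group can be pictured as a simple lookup. The sketch below is only an illustration under stated assumptions: the SNP names, the alleles, and the `HAPLOTYPE_BLOCKS` table are all invented, and it treats the linkage within a group as perfect, which real data only approximate.

```python
# Hypothetical haplotype groups. For each marker SNP, every observed allele
# maps to the alleles expected at the other SNPs in the same group.
HAPLOTYPE_BLOCKS = {
    "snpA": {
        "G": {"snpB": "T", "snpC": "C"},
        "A": {"snpB": "C", "snpC": "T"},
    },
}


def infer_linked_alleles(marker_id, marker_allele, blocks=HAPLOTYPE_BLOCKS):
    """Return the alleles inferred for linked SNPs from a single marker genotype."""
    try:
        return dict(blocks[marker_id][marker_allele])
    except KeyError:
        raise KeyError(
            f"no haplotype information for {marker_id}:{marker_allele}"
        ) from None


# Only the marker is genotyped; the rest of the group is inferred.
print(infer_linked_alleles("snpA", "G"))
```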


In a common design for a GWAS, two groups of individuals are chosen; 
one group has the disease, and the other group does not. The individuals in 
each group are matched in other characteristics to reduce the effect of 
confounding variables causing differences between the two groups. For 
example, the genotypes may differ because the two groups are mostly taken 
from different parts of the world. Once the individuals are chosen
(typically a thousand or more are needed for the study to work),
samples of their DNA are obtained. The DNA is analyzed using automated
systems to identify large differences in the percentage of particular SNPs 
between the two groups. Often the study examines a million or more SNPs 
in the DNA. The results of GWAS can be used in two ways: the genetic 
differences may be used as markers for susceptibility to the disease in 
undiagnosed individuals, and the particular genes identified can be targets 
for research into the molecular pathway of the disease and potential 
therapies. An offshoot of the discovery of gene associations with disease 
has been the formation of companies that provide so-called “personal 
genomics” that will identify risk levels for various diseases based on an 
individual’s SNP complement. The science behind these services is 
controversial. 
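
The comparison between the two groups can be sketched as a test on allele counts for each SNP. The example below is a simplified illustration rather than the method of any particular study: the counts are made up, the function name is hypothetical, and it uses a plain chi-square test on a 2x2 table (one degree of freedom). A real analysis of a million or more SNPs would also need a much stricter significance cutoff to account for the number of tests.

```python
import math


def allele_association(case_alt, case_ref, control_alt, control_ref):
    """Chi-square statistic and p-value (1 degree of freedom) for a 2x2 allele table."""
    a, b, c, d = case_alt, case_ref, control_alt, control_ref
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        # An empty row or column carries no evidence of association.
        return 0.0, 1.0
    chi2 = n * (a * d - b * c) ** 2 / denom
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Made-up counts: (variant allele in cases, reference allele in cases,
#                  variant allele in controls, reference allele in controls)
snp_counts = {
    "snp1": (1200, 800, 1000, 1000),
    "snp2": (990, 1010, 1000, 1000),
}

for snp_id, counts in snp_counts.items():
    chi2, p_value = allele_association(*counts)
    print(f"{snp_id}: chi-square = {chi2:.2f}, p = {p_value:.3g}")
```

In this toy data set the first SNP differs strongly between the groups and the second does not. As the text notes, even a strong difference shows an association, not a cause.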


Because GWAS looks for associations between genes and disease, these 
studies provide data for other research into causes, rather than answering
specific questions themselves. An association between a gene difference
and a disease does not necessarily mean there is a cause-and-effect 
relationship. However, some studies have provided useful information 
about the genetic causes of diseases. For example, three different studies in
2005 identified a gene for a protein involved in regulating inflammation in
the body that is associated with age-related macular degeneration, a disease
that causes blindness. This opened up new possibilities for research
into the cause of this disease. A large number of genes have been identified 
to be associated with Crohn’s disease using GWAS, and some of these have 
suggested new hypothetical mechanisms for the cause of the disease. 


Pharmacogenomics 


Pharmacogenomics involves evaluating the effectiveness and safety of 
drugs on the basis of information from an individual's genomic sequence. 
Personal genome sequence information can be used to prescribe 
medications that will be most effective and least toxic on the basis of the 
individual patient’s genotype. Studying changes in gene expression could 
provide information about the gene transcription profile in the presence of 
the drug, which can be used as an early indicator of the potential for toxic 
effects. For example, genes involved in cellular growth and controlled cell 
death, when disturbed, could lead to the growth of cancerous cells. 
Genome-wide studies can also help to find new genes involved in drug 
toxicity. The gene signatures may not be completely accurate, but can be 
tested further before pathologic symptoms arise. 


Metagenomics 


Traditionally, microbiology has been taught with the view that 
microorganisms are best studied under pure culture conditions, which 
involves isolating a single type of cell and culturing it in the laboratory. 
Because microorganisms can go through several generations in a matter of 
hours, their gene expression profiles adapt to the new laboratory 
environment very quickly. On the other hand, many species resist being
cultured in isolation. Most microorganisms do not live as isolated entities,
but in microbial communities known as biofilms. For all of these reasons, 
pure culture is not always the best way to study microorganisms. 
Metagenomics is the study of the collective genomes of multiple species 
that grow and interact in an environmental niche. Metagenomics can be 
used to identify new species more rapidly and to analyze the effect of 
pollutants on the environment ([link]). Metagenomics techniques can now 
also be applied to communities of higher eukaryotes, such as fish. 
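
Reconstructing longer sequences from overlapping fragments can be sketched with a small greedy routine. This is a toy illustration under strong assumptions: the reads are invented, error-free, and all from one strand, and the function names are hypothetical. Real metagenomic assemblers use far more elaborate methods.

```python
def overlap_length(left, right, min_overlap=3):
    """Length of the longest suffix of `left` that matches a prefix of `right`."""
    for size in range(min(len(left), len(right)), min_overlap - 1, -1):
        if left.endswith(right[:size]):
            return size
    return 0


def greedy_assemble(fragments, min_overlap=3):
    """Repeatedly merge the pair of fragments that share the longest overlap."""
    contigs = list(dict.fromkeys(fragments))  # drop exact duplicates, keep order
    while len(contigs) > 1:
        best_size, best_i, best_j = 0, None, None
        for i, left in enumerate(contigs):
            for j, right in enumerate(contigs):
                if i == j:
                    continue
                size = overlap_length(left, right, min_overlap)
                if size > best_size:
                    best_size, best_i, best_j = size, i, j
        if best_size == 0:
            break  # nothing overlaps any more; keep the pieces separate
        merged = contigs[best_i] + contigs[best_j][best_size:]
        contigs = [c for k, c in enumerate(contigs) if k not in (best_i, best_j)]
        contigs.append(merged)
    return contigs


# Three made-up reads that overlap end to end.
reads = ["ATGGCGT", "GCGTACC", "TACCTGA"]
print(greedy_assemble(reads))
```

Fragments that share no overlap are left as separate pieces, which loosely mirrors how sequences from different species in a sample end up in different reconstructed genomes.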


All the genomic DNA from a particular environment is cut into fragments and [...]
[...] DNA from multiple species within an environmental niche. The DNA [...]
entire genome sequences of multiple species to be reconstructed from the
sequences of overlapping pieces.


Creation of New Biofuels 


Knowledge of the genomics of microorganisms is being used to find better 
ways to harness biofuels from algae and cyanobacteria. The primary 
sources of fuel today are coal, oil, wood, and other plant products such as 
ethanol. Although plants are renewable resources, there is still a need to 
find more alternative renewable sources of energy to meet our population’s 
energy demands. The microbial world is one of the largest resources for 
genes that encode new enzymes and produce new organic compounds, and 
it remains largely untapped. This vast genetic resource holds the potential to 
provide new sources of biofuels ([link]).


Renewable fuels were
tested in Navy ships and
aircraft at the first Naval
Energy Forum. (credit:
modification of work by
John F. Williams, US
Navy)


Mitochondrial Genomics 


Mitochondria are intracellular organelles that contain their own DNA. 
Mitochondrial DNA mutates at a rapid rate and is often used to study 
evolutionary relationships. Another feature that makes studying the 
mitochondrial genome interesting is that in most multicellular organisms, 
the mitochondrial DNA is passed on from the mother during the process of 
fertilization. For this reason, mitochondrial genomics is often used to trace 
genealogy. 


Genomics in Forensic Analysis 


Information and clues obtained from DNA samples found at crime scenes 
have been used as evidence in court cases, and genetic markers have been 
used in forensic analysis. Genomic analysis has also become useful in this 
field. In 2001, the first use of genomics in forensics was published. It was a 
collaborative effort between academic research institutions and the FBI to 
solve the mysterious cases of anthrax ([link]) that was transported by the 
US Postal Service. Anthrax bacteria were made into an infectious powder 
and mailed to news media and two U.S. Senators. The powder infected the 
administrative staff and postal workers who opened or handled the letters. 
Five people died, and 17 were sickened from the bacteria. Using microbial 
genomics, researchers determined that a specific strain of anthrax was used 
in all the mailings; eventually, the source was traced to a scientist at a 
national biodefense laboratory in Maryland. 


Bacillus anthracis is the
organism that causes anthrax.
(credit: modification of work
by CDC; scale-bar data from
Matt Russell)


Genomics in Agriculture 


Genomics can reduce the trials and failures involved in scientific research 
to a certain extent, which could improve the quality and quantity of crop 
yields in agriculture ([link]). Linking traits to genes or gene signatures helps 
to improve crop breeding to generate hybrids with the most desirable 
qualities. Scientists use genomic data to identify desirable traits, and then 
transfer those traits to a different organism to create a new genetically 
modified organism, as described in the previous module. Scientists are 
discovering how genomics can improve the quality and quantity of 
agricultural production. For example, scientists could use desirable traits to 
create a useful product or enhance an existing product, such as making a 
drought-sensitive crop more tolerant of the dry season. 


Transgenic agricultural plants can
be made to resist disease. These
transgenic plums are resistant to
the plum pox virus. (credit: Scott
Bauer, USDA ARS)


Proteomics 


Proteins are the final products of genes, and they carry out the functions
encoded by those genes. Proteins are composed of amino acids and play important roles
in the cell. All enzymes (except ribozymes) are proteins and act as catalysts 
that affect the rate of reactions. Proteins are also regulatory molecules, and 
some are hormones. Transport proteins, such as hemoglobin, help transport 
oxygen to various organs. Antibodies that defend against foreign particles 
are also proteins. In the diseased state, protein function can be impaired 
because of changes at the genetic level or because of direct impact on a 
specific protein. 


A proteome is the entire set of proteins produced by a cell type. Proteomes 
can be studied using the knowledge of genomes because genes code for 
mRNAs, and the mRNAs encode proteins. The study of the function of 
proteomes is called proteomics. Proteomics complements genomics and is
useful when scientists want to test their hypotheses that were based on
genes. Even though all cells in a multicellular organism have the same set 
of genes, the set of proteins produced in different tissues is different and 
dependent on gene expression. Thus, the genome is constant, but the 
proteome varies and is dynamic within an organism. In addition, RNAs can 
be alternatively spliced (cut and pasted to create novel combinations and 
novel proteins), and many proteins are modified after translation. Although 
the genome provides a blueprint, the final architecture depends on several 
factors that can change the progression of events that generate the 
proteome. 
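
One way to see how a single gene can give rise to several proteins is to list the transcripts that alternative splicing could produce. The sketch below uses a deliberately simplified model that is an assumption, not a description of any real gene: the first and last exons are always kept, any internal exon may be skipped, and exon order never changes. The exon labels and the function name are hypothetical.

```python
from itertools import combinations


def splice_variants(exons):
    """List transcripts that keep the first and last exon and any subset of internal exons."""
    if len(exons) < 2:
        return [tuple(exons)]
    first, *internal, last = exons
    variants = []
    # Go from keeping every internal exon down to skipping all of them.
    for keep in range(len(internal), -1, -1):
        for subset in combinations(internal, keep):
            variants.append((first, *subset, last))
    return variants


# A made-up gene with four exons.
for variant in splice_variants(["E1", "E2", "E3", "E4"]):
    print("-".join(variant))
```

Under this model a gene with two skippable internal exons already yields four different transcripts from one unchanged stretch of genome.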


Genomes and proteomes of patients suffering from specific diseases are 
being studied to understand the genetic basis of the disease. The most 
prominent disease being studied with proteomic approaches is cancer 
([link]). Proteomic approaches are being used to improve the screening and
early detection of cancer; this is achieved by identifying proteins whose 
expression is affected by the disease process. An individual protein is called 
a biomarker, whereas a set of proteins with altered expression levels is 
called a protein signature. For a biomarker or protein signature to be 
useful as a candidate for early screening and detection of a cancer, it must 
be secreted in body fluids such as sweat, blood, or urine, so that large-scale 
screenings can be performed in a noninvasive fashion. The current problem 
with using biomarkers for the early detection of cancer is the high rate of 
false-negative results. A false-negative result is a negative test result that 
should have been positive. In other words, many cases of cancer go 
undetected, which makes biomarkers unreliable. Some examples of protein 
biomarkers used in cancer detection are CA-125 for ovarian cancer and 
PSA for prostate cancer. Protein signatures may be more reliable than 
biomarkers to detect cancer cells. Proteomics is also being used to develop 
individualized treatment plans, which involves the prediction of whether or 
not an individual will respond to specific drugs and the side effects that the 
individual may have. Proteomics is also being used to predict the possibility 
of disease recurrence. 
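
The false-negative problem can be made concrete with a short calculation. The numbers below are invented for illustration and do not describe CA-125, PSA, or any real test; the function name is hypothetical.

```python
def false_negative_rate(true_positives, false_negatives):
    """Share of actual cancer cases that the test reports as negative."""
    actual_cases = true_positives + false_negatives
    if actual_cases == 0:
        raise ValueError("at least one actual case is required")
    return false_negatives / actual_cases


# Hypothetical screening: 200 people with confirmed cancer are tested,
# and the biomarker test flags 140 of them.
detected = 140
missed = 60

fnr = false_negative_rate(detected, missed)
sensitivity = 1.0 - fnr  # share of actual cases the test does catch

print(f"false-negative rate = {fnr:.0%}")
print(f"sensitivity = {sensitivity:.0%}")
```

A test with a high false-negative rate misses many real cases, which is the weakness of single biomarkers described above.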


This machine is preparing to do a 
proteomic pattern analysis to 
identify specific cancers so that an 
accurate cancer prognosis can be 
made. (credit: Dorie Hightower, 
NCI, NIH) 


The National Cancer Institute has developed programs to improve the 
detection and treatment of cancer. The Clinical Proteomic Technologies for 
Cancer and the Early Detection Research Network are efforts to identify 
protein signatures specific to different types of cancers. The Biomedical 
Proteomics Program is designed to identify protein signatures and design 
effective therapies for cancer patients. 


Section Summary 


Genome mapping is similar to solving a big, complicated puzzle with 
pieces of information coming from laboratories all over the world. Genetic 
maps provide an outline for the location of genes within a genome, and they 
estimate the distance between genes and genetic markers on the basis of the 
recombination frequency during meiosis. Physical maps provide detailed 
information about the physical distance between the genes. The most 
detailed information is available through sequence mapping. Information
from all mapping and sequencing sources is combined to study an entire
genome. 


Whole genome sequencing is the latest available resource to treat genetic 
diseases. Some doctors are using whole genome sequencing to save lives. 
Genomics has many industrial applications, including biofuel development, 
agriculture, pharmaceuticals, and pollution control. 


Imagination is the only barrier to the applicability of genomics. Genomics 
is being applied to most fields of biology; it can be used for personalized 
medicine, prediction of disease risks at an individual level, the study of 
drug interactions before the conduction of clinical trials, and the study of 
microorganisms in the environment as opposed to the laboratory. It is also 
being applied to the generation of new biofuels, genealogical assessment 
using mitochondria, advances in forensic science, and improvements in 
agriculture. 


Proteomics is the study of the entire set of proteins expressed by a given 
type of cell under certain environmental conditions. In a multicellular 
organism, different cell types will have different proteomes, and these will 
vary with changes in the environment. Unlike a genome, a proteome is
dynamic and under constant flux, which makes studying it more complicated,
and more useful, than knowledge of the genome alone.


Glossary 


biomarker 
an individual protein that is uniquely produced in a diseased state 


genetic map 
an outline of genes and their location on a chromosome that is based 
on recombination frequencies between markers 


genomics 
the study of entire genomes, including the complete set of genes, their 
nucleotide sequence and organization, and their interactions within a 
species and with other species 


metagenomics 
the study of the collective genomes of multiple species that grow and 
interact in an environmental niche 


model organism 
a species that is studied and used as a model to understand the 
biological processes in other species represented by the model 
organism 


pharmacogenomics 
the study of drug interactions with the genome or proteome; also called 
toxicogenomics 


physical map 
a representation of the physical distance between genes or genetic 
markers 


protein signature 
a set of over- or under-expressed proteins characteristic of cells in a
particular diseased tissue


proteomics 
study of the function of proteomes 


whole genome sequencing 
a process that determines the nucleotide sequence of an entire genome 


